Energy-Effective Instruction Fetch Unit for Wide Issue Processors

نویسندگان

  • Juan L. Aragón
  • Alexander V. Veidenbaum
چکیده

Continuing advances in semiconductor technology and demand for higher performance will lead to more powerful, superpipelined and wider issue processors. Instruction caches in such processors will consume a significant fraction of the on-chip energy due to very wide fetch on each cycle. This paper proposes a new energy-effective design of the fetch unit that exploits the fact that not all instructions in a given I-cache fetch line are used due to taken branches. A Fetch Mask Determination unit is proposed to detect which instructions in an I-cache access will actually be used to avoid fetching any of the other instructions. The solution is evaluated for a 4-, 8and 16-wide issue processor in 100nm technology. Results show an average improvement in the Icache Energy-Delay product of 20% for the 8-wide issue processor and 33% for the 16-wide issue processor for the SPEC2000, with no negative impact on performance.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Instruction Pre-Processing in Trace Processors

In trace processors, a sequential program is partitioned at run time into “traces.” A trace is an encapsulation of a dynamic sequence of instructions. A processor that uses traces as the unit of sequencing and execution achieves high instruction fetch rates and can support very wide-issue execution engines. We propose a new class of hardware optimizations that transform the instructions within ...

متن کامل

An Exploration Of Instruction Fetch Requirement In Out-of-order Superscalar Processors

Automated design of superscalar processors can provide future in terms a cycles-per-instruction (CPI) using the application program statistics and the 124, Optimization of Instruction Fetch Mechanisms for High Issue Rates 117, A first-order superscalar processor model Karkhanis, Smith 2004 (Show Context). Because superscalar architectures include complicated control logic for out-of-order execu...

متن کامل

Increasing the Instruction Fetch Rate via Block-Structured Instruction Set Architectures - Microarchitecture, 1996., IEEE/ACM International Symposium on

To exploit larger amounts of instruction level parallelism, processors are being built with wider issue widths and larger numbers offunctional units. Instruction fetch rate must also be increased in order to effectively exploit the performance potential of such processors. Block-structured ISAs provide an effective means of increasing the instruction fetch rate. We define an optimization, calle...

متن کامل

Block Based Fetch Engine for Superscalar Processors

The implementation of modern high performance computer is increasingly directed toward parallelism in the hardware. However, most of the current fetch units are limited to one branch prediction per cycle and therefore, can fetch no more than one basic block per cycle. While fetching a single basic block each cycle is sufficient for implementations that issue small number of instructions per cyc...

متن کامل

On Augmenting Trace Cache for High-Bandwidth Value Prediction

Value prediction is a technique that breaks true data dependences by predicting the outcome of an instruction and speculatively executes its data-dependent instructions based on the predicted outcome. As the instruction fetch rate and issue rate of processors increase, the potential data dependences among instructions issued in the same cycle also increase. Value prediction and speculative exec...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2005